CHAPTER 3 Getting Statistical: A Short Review of Basic Statistics 35
» The histogram's y-axis represents the number (or frequency) of individuals in
the data that fall in the numerical ranges (known as classes) of the value being
charted, which are listed across the x-axis. In this case, the y-axis
represents the number of states falling in each class.
» The histogram's x-axis represents classes, or numerical ranges of the value
being charted, which in this case is the number of airports.
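To make the two axes concrete, here is a minimal Python sketch of the counting that sits behind a histogram. The function name class_frequencies, the class width of 100, and the eight airport counts are all made up for illustration; they are not the data behind the figure.

```python
from collections import Counter

def class_frequencies(values, class_width):
    """Count how many values fall in each class [low, low + class_width)."""
    # Map each value to the lower bound of its class, then tally the classes.
    counts = Counter((v // class_width) * class_width for v in values)
    return {(low, low + class_width): counts[low] for low in sorted(counts)}

# Hypothetical airport counts for eight made-up states (not real data).
airport_counts = [12, 45, 130, 88, 210, 95, 30, 150]

freq = class_frequencies(airport_counts, class_width=100)
for (low, high), n in freq.items():
    # x-axis: the class (range of airports); y-axis: the number of states in it
    print(f"{low}-{high - 1} airports: {n} states")
```

Each dictionary key plays the role of a class on the x-axis, and each value is the height of the bar on the y-axis. Note that this sketch leaves out classes that contain no states at all.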
We first made a histogram of the census, and then we took four random samples of
20 states and made a histogram of each sample. Figure 3-1 shows the results.
As shown in Figure 3-1, when you compare the sample distributions to the
distribution of the population using the histograms, you can see differences.
Sample 2 looks much more like the population than Sample 4 does. However, they
are all valid samples in that they were randomly selected from the population.
Each sample is an approximation of the true population distribution. In addition,
the mean and standard deviation of each sample are likely close to the mean and
standard deviation of the population, but not equal to them. (For a refresher on
mean and standard deviation, see Chapter 9.) These characteristics of sampling
error, where valid samples from the population are almost always somewhat
different from the population, are true of any random sample.
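If you want to see sampling error for yourself, the following Python sketch mimics the exercise described above. It assumes a synthetic population of 51 randomly generated values standing in for the states and the District of Columbia, so the numbers it prints are not the real airport figures. The seed value and the range of 10 to 500 are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so reruns draw the same samples

# Synthetic stand-in population of 51 values (NOT the real airport data).
population = [rng.randint(10, 500) for _ in range(51)]

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)  # population SD: every value is known

# Four simple random samples of 20, each drawn without replacement.
samples = [rng.sample(population, 20) for _ in range(4)]

print(f"Population: mean={pop_mean:.1f}, SD={pop_sd:.1f}")
for i, s in enumerate(samples, start=1):
    # stdev() uses the sample (n - 1) formula, suited to a sample of the data
    print(f"Sample {i}: mean={statistics.mean(s):.1f}, "
          f"SD={statistics.stdev(s):.1f}")
```

Running the sketch should show four sample means and standard deviations that land near the population values without matching them exactly, which is the pattern the histograms illustrate.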
Digging into probability distributions
As described in the preceding section, samples differ from populations because of
random fluctuations. Because these random fluctuations fall into patterns,
FIGURE 3-1: Distribution of number of private and public airports in 2011 in the population (of 50 states and the District of Columbia), and four different samples of 20 states from the same population.
© John Wiley & Sons, Inc.